Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations

Authors

  • H. Brendan McMahan
  • Francesco Orabona
Abstract

We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries. Moreover, using our tools, we develop an algorithm that provides a regret bound of O(U √(T log(U √T log T + 1))), where U is the L2 norm of an arbitrary comparator and both T and U are unknown to the player. This bound is optimal up to √(log log T) terms. When T is known, we derive an algorithm with an optimal regret bound (up to constant factors). For both the known and unknown T case, a Normal approximation to the conditional value of the game proves to be the key analysis tool.
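As a rough illustration of the potential-based viewpoint behind such bounds, the sketch below plays the gradient of a normal-style potential eps · exp(θ²/(2t)) in one dimension and measures regret against a fixed comparator. This is only a caricature of the "Normal approximation" idea in the abstract, not the paper's exact minimax algorithm; all names and constants here are our own.

```python
import numpy as np

def normal_potential_plays(grads, eps=1.0):
    """Illustrative 1-D potential-based strategy: play the gradient of
    psi(theta, t) = eps * exp(theta**2 / (2t)), where theta is the
    negative sum of past gradients.  NOT the paper's exact minimax
    algorithm; a sketch of the normal-potential idea only."""
    theta, plays = 0.0, []
    for t, g in enumerate(grads, start=1):
        if t == 1:
            w = 0.0                  # no information on the first round
        else:
            w = eps * (theta / (t - 1)) * np.exp(theta**2 / (2 * (t - 1)))
        plays.append(w)
        theta -= g                   # accumulate the (negated) gradients
    return plays

def regret(plays, grads, u):
    """R_T(u) = sum_t g_t * (w_t - u) for linear losses."""
    return float(sum(g * (w - u) for g, w in zip(grads, plays)))
```

On a stream of constant negative gradients the plays grow without any tuned learning rate, which is the hallmark of the unconstrained setting: the strategy drifts toward a large comparator whose norm U it was never told.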


Similar resources

On Early Stopping in Gradient Descent Learning

In this paper, we study a family of gradient descent algorithms to approximate the regression function from Reproducing Kernel Hilbert Spaces (RKHSs), the family being characterized by a polynomial decreasing rate of step sizes (or learning rate). By solving a bias-variance trade-off we obtain an early stopping rule and some probabilistic upper bounds for the convergence of the algorithms. Thes...
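The setting above can be sketched as functional gradient descent on the kernel matrix with a polynomially decreasing step size. The paper derives its early-stopping rule analytically from a bias-variance trade-off; the sketch below instead stops when a held-out error first increases, which is purely an assumption for illustration, and all names are ours.

```python
import numpy as np

def kernel_gd_early_stop(K, y, K_val, y_val, max_iter=200, eta0=1.0, theta=0.0):
    """Functional gradient descent for least-squares regression in an
    RKHS, with step sizes eta_t = eta0 / (t+1)**theta (polynomial decay)
    and a naive early-stopping heuristic: stop when validation error
    first increases.  A sketch of the setting, not the paper's rule."""
    n = K.shape[0]
    alpha = np.zeros(n)                    # f_t = sum_i alpha_i k(x_i, .)
    best_err, best_alpha = np.inf, alpha.copy()
    for t in range(max_iter):
        eta = eta0 / (t + 1) ** theta      # polynomially decreasing rate
        residual = (K @ alpha - y) / n     # functional gradient of the risk
        alpha = alpha - eta * residual
        val_err = np.mean((K_val @ alpha - y_val) ** 2)
        if val_err > best_err:             # heuristic stopping point
            break
        best_err, best_alpha = val_err, alpha.copy()
    return best_alpha
```

Running longer shrinks the bias of the iterate while inflating its variance, which is exactly the trade-off the stopping time resolves.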


Regret bounds for Non Convex Quadratic Losses Online Learning over Reproducing Kernel Hilbert Spaces

We present several online algorithms with dimension-free regret bounds for general nonconvex quadratic losses by viewing them as functions in Reproducing Kernel Hilbert Spaces. In our work we adapt the Online Gradient Descent, Follow the Regularized Leader, and Conditional Gradient meta-algorithms to the RKHS setting and provide regret bounds there. By analyzing them as algorith...


Coin Betting and Parameter-Free Online Learning

In recent years, a number of parameter-free algorithms have been developed for online linear optimization over Hilbert spaces and for learning with expert advice. These algorithms achieve optimal regret bounds that depend on the unknown competitors, without having to tune the learning rates with oracle choices. We present a new intuitive framework to design parameter-free algorithms based o...
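A minimal one-dimensional sketch of the coin-betting reduction this abstract refers to, using the classical Krichevsky-Trofimov betting fraction: the player bets a fraction of its current wealth, and the resulting plays need no learning rate. Gradients are assumed bounded in [-1, 1], and the function name is ours.

```python
def kt_coin_betting(grads, initial_wealth=1.0):
    """Coin-betting reduction for 1-D unconstrained online linear
    optimization: bet the Krichevsky-Trofimov fraction of current
    wealth each round.  A sketch of the framework, assuming
    gradients in [-1, 1]."""
    wealth, s, plays = initial_wealth, 0.0, []
    for t, g in enumerate(grads, start=1):
        beta = s / t                 # KT betting fraction
        w = beta * wealth            # play = signed bet
        plays.append(w)
        wealth += -g * w             # linear reward -g*w updates wealth
        s += -g                      # running sum of "coin outcomes"
    return plays
```

Because the bet is always a fraction of accumulated wealth, the strategy automatically scales its plays to the unknown comparator norm instead of requiring a tuned step size.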


Adaptive Filtering in Reproducing Kernel Hilbert Spaces (Ph.D. dissertation, University of Florida)

Weifeng Liu, December 2008. Chair: Jose C. Principe. Major: Electrical and Computer Engineering. The theory of linear adaptive filters has reached maturity, unlike the field of nonli...


Regularized Policy Iteration with Nonparametric Function Spaces

We study two regularization-based approximate policy iteration algorithms, namely REG-LSPI and REG-BRM, to solve reinforcement learning and planning problems in discounted Markov Decision Processes with large state and finite action spaces. The core of these algorithms consists of regularized extensions of Least-Squares Temporal Difference (LSTD) learning and Bellman Residual Minimization (BRM),...




Publication date: 2014